Abstract. To elucidate the molecular basis of early gastric
cancer (EGC), the genome-wide expression patterns of cancer
and normal tissues from 27 patients were analyzed by a
microarray-based method. Using an integrative systematic
bioinformatics approach, we classified the differentially
expressed genes in EGC. Interestingly, the genes more highly
expressed in EGC exhibited the most significant
correlation with cell migration and metastasis. This implies
that, even at the early stage of gastric cancer, the molecular
properties usually observed in late-stage cancer are already
present. Furthermore, we have found a novel association
between the expression pattern and molecular pathways of
EGC and estrogen receptor α (ERα)-negative breast cancer
through cross-experimental analysis. These results provide
new insights into the biological properties of EGC and
yield useful basic data for the study of the molecular
mechanisms of EGC carcinogenesis.

Introduction 

Gastric cancer (GC) is the fourth most common cancer and 
the second leading cause of cancer-related deaths worldwide. 
Its prevalence is particularly high in East Asia, including 
countries such as China, Japan and Korea (1). The prognosis 
of GC depends on the stage at diagnosis, i.e., early gastric
cancer (EGC) or advanced gastric cancer (AGC) (2). Despite
the surgical advances that have improved long-term survival
of GC patients (3,4), a molecular understanding of EGC and
novel molecular biomarkers for it are still urgently
required, as EGC may progress towards AGC (2).

To address this, several microarray analyses in GC have 
been performed and have identified gene expression patterns 
that may be useful in the prognosis and diagnosis of the cancer 
(5,6); however, these approaches did not consider the different 
stages or subtypes of GC. Recent studies that did consider
stage differences (2,7,8) did not reveal the multiple phenotypes
underlying EGC, because their primary aim was to study a
handful of gene sets that differentiate the stages.
Accordingly, we further explored the various hidden phenotypes,
functions and pathways in EGC by using an integrative
systematic bioinformatics approach.

Here, we focus on molecular understanding of EGC-specific 
expression patterns gained by employing a systematic 
approach, including function and pathway, as well as cross- 
experiment analyses of 27 pairs of EGC tissues and their 
normal counterparts. Interestingly, the function and pathway 
analyses show that the upregulated genes in EGC tissues 
correlate with cell migration and metastasis, events typical of 
late-stage cancer. In addition, we propose a novel association
between EGC and estrogen receptor α (ERα)-negative breast
cancer that was indicated by cross-experiment analysis, and
which enables us to identify various associated phenotypes.

Materials and methods

Patients and samples. Tissue samples were prospectively 
collected from patients who underwent gastric surgery or 
gastroscopy at the National Cancer Center (NCC) Hospital 
between 2008 and 2009. All tissues were obtained according to
protocols approved by the Institutional Review Board of the NCC
under its human subject guideline (NCCNCS-08-127), in
accordance with the principles of the Declaration of
Helsinki. The samples were obtained by endoscopic biopsy
from gastric cancer patients who gave informed consent to the
protocol, and were stored at -80°C. The clinical and
pathological features of the patients are listed in Table I.
Table I. Clinical features of the 27 patients with gastric cancer.

Characteristics               No. of patients(a)
Total                         27
  Male                        20
  Female                      7
Age at diagnosis (years)
  Range                       41-78
  Mean ± SD                   60.3±11.2
TNM stage(b)
  T classification
    T1                        17
    T2                        10
    T3                        0
  N classification
    N0                        13
    N1                        10
    N2                        2
    N3                        2
  M classification
    M0                        27
    M1                        0
Lauren classification
  Intestinal                  12
  Diffuse                     7
  Mixed                       6
  NA(c)                       2

(a) The 27 patient samples were used in microarray analysis. (b) UICC/AJCC
6th edition. (c) Lauren classification not available.

Table II. The primer sequences used in RT-PCR.

Primer ID     Sequence (5'→3')
MMP1-F        CTGGAATTGGCCACAAAGTT
MMP1-R        CCTTCTTTGGACTCACACCA
MMP3-F        CCCTGGGTCTCTTTCACTCA
MMP3-R        TCAAAGGACAAAGCAGGATC
MMP7-F        CGGATGGTAGCAGTCTAGGG
MMP7-R        TGAATGGATGTTCTGCCTGA
MMP9-F        GGGAAGATGCTGCTGTTCA
MMP9-R        TCAACTCACTCCGGGAACTC
MMP10-F       GGCTCTTTCACTCAGCCAAC
MMP10-R       TCCCGAAGGAACAGATTTTG
MMP12-F       CCTTCAGCCAGAAGAACCTG
MMP12-R       ACACATTTCGCCTCTCTGCT
MMP13-F       TTGAGCTGGACTCATTGTCG
MMP13-R       GGAGCCTCTCAGTCATGGAG
GAPDH-F       TGCACCACCAACTGCTTA
GAPDH-R       GGATGCAGGGATGATGTTC
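
As a quick sanity check on RT-PCR primers such as those in Table II, the length, GC content and an approximate melting temperature can be computed directly from the sequences. The sketch below is illustrative only: the Wallace rule (2°C per A/T, 4°C per G/C) is a rough estimate chosen here for simplicity and is not stated to be the method used in this study, and the function names are hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Two primer pairs copied from Table II; the remaining pairs work the same way.
PRIMERS = {
    "MMP1-F": "CTGGAATTGGCCACAAAGTT",
    "MMP1-R": "CCTTCTTTGGACTCACACCA",
    "GAPDH-F": "TGCACCACCAACTGCTTA",
    "GAPDH-R": "GGATGCAGGGATGATGTTC",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, Tm ~{wallace_tm(seq)} C")
```

Estimates of this kind are only a coarse guide; nearest-neighbor thermodynamic models give more reliable melting temperatures.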



RNA extraction. Total RNA was extracted from gastric
cancer and adjacent normal tissues from EGC patients using
TRIzol reagent (Invitrogen, Carlsbad, CA, USA), followed
by purification of the RNA using Qiagen RNeasy mini kit
columns (Valencia, CA, USA) according to the manufacturer's
instructions. RNA quality was evaluated using the Agilent
2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA)
and the concentration was measured with a NanoDrop 1000 (Thermo
Scientific, Wilmington, DE, USA). Only RNAs showing
distinct 18S/28S ribosomal peak ratios of 1.5-2.0 in the
Bioanalyzer (Agilent Technologies) and 260/280 ratios of
1.8-2.1 in the NanoDrop (Thermo Scientific) analyses were
accepted for further analysis.



Microarray analysis and data processing. Genome-wide gene
expression was analyzed in the 27 paired EGC tissue samples
using the Affymetrix GeneChip Human Exon 1.0 ST Array (Santa
Clara, CA, USA). Target preparation and microarray processing
procedures were carried out as described in the manufacturer's
instructions, and raw data were deposited in the NCBI Gene
Expression Omnibus (GSE30727). The data were preprocessed
by the default robust multi-array average (RMA) method
implemented in the Bioconductor (www.bioconductor.org)
'oligo' package. The differentially expressed genes between
EGC tissues and adjacent non-cancerous gastric tissues (i.e.,
the up- and downregulated genes in EGC) were filtered by a
fold-change cut-off of 1.5 and a P-value cut-off of 0.05.

Functional/pathway enrichment analysis and cross-experimental
analysis. We downloaded a Gene Ontology (GO)
annotation file (gene_association.goa_human) and an ontology
file (gene_ontology_ext.obo) from www.geneontology.org,
as recommended by the BiNGO tutorial (9). In the BiNGO
analysis, all options, except for filtering the IEA code, were set at
default values. The false discovery rate (FDR) cut-off was 0.05.
DAVID v6.7 software (http://david.abcc.ncifcrf.gov/) was used to
summarize the over-representation of the KEGG pathways (10).
The gene expression signatures of up- or downregulated genes
in EGC were analyzed using the L2L microarray analysis tool
(http://depts.washington.edu/l2l/) (11).

Reverse transcription PCR. Two micrograms of total RNA
were reverse transcribed with SuperScript III reverse
transcriptase (Invitrogen). Reverse transcription PCR (RT-PCR)
was performed using 5 ng cDNA for 1 cycle at 94°C for 2 min,
followed by 32-35 cycles of 94°C for 20 sec, 60°C for 40 sec
and 72°C for 30 sec, using gene-specific primers (Table II).
Gene expression levels were analyzed by gel electrophoresis.



Hierarchical clustering. Independent additional cancer datasets
were obtained from NCBI GEO (www.ncbi.nlm.nih.gov/geo)
and EBI ArrayExpress (www.ebi.ac.uk/arrayexpress):
GSE19536 for ERα-negative breast cancers (12), and the
E-MTAB-62 dataset for Ewing's sarcoma, bladder cancer,
small cell lung cancer and LNCaP prostate cancer cell lines
(13). The up- and downregulated genes in the EGC tissues
were compared with these 5 cancer types. We transformed
the expression of all our EGC tissue samples, GSE19536
Table III. The GO biological process terms associated with genes upregulated in gastric cancer tissues, relating to wound healing,
cell migration and cell motility.

GO-ID        P-value      Corrected P-value(a)  x(b)  n(c)         x/n (%)      Description
GO:0014910   1.54E-03     1.86E-02              4     14           29           Regulation of smooth muscle cell migration
[illegible]  [illegible]  [illegible]           11    [illegible]  [illegible]  Regulation of wound healing
GO:0010595   4.41E-03     4.25E-02              5     29           17           Positive regulation of endothelial cell migration
GO:0030335   3.69E-07     2.38E-05              18    116          16           Positive regulation of cell migration
GO:2000147   3.69E-07     2.38E-05              18    116          16           Positive regulation of cell motility
GO:0030334   9.12E-07     4.44E-05              23    190          12           Regulation of cell migration
GO:2000145   1.44E-06     6.12E-05              23    195          12           Regulation of cell motility
GO:0048870   9.75E-10     1.81E-07              38    330          12           Cell motility
GO:0042060   1.04E-10     3.09E-08              50    485          10           Wound healing

(a) Multiple comparison corrected P-value. (b) The number of the input genes annotated to a certain GO term. (c) The number of genes in the reference
set annotated to a certain GO term.



Figure 1. The GO analysis by BiNGO. (A) The upregulated molecular functions in EGC tissues. (B) The upregulated biological processes in EGC tissues. (C) The
downregulated molecular functions in EGC tissues. (D) The downregulated biological processes in EGC tissues. The information is presented as a percentage of x/n
(x, the number of genes in a cluster annotated for a certain GO term; n, the number of genes in the reference set annotated for a certain GO term).



and E-MTAB-62, into standard scores (z-scores), and then 
performed hierarchical clustering for the 6 cancers. 
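
The z-score transformation and clustering step can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the distance measure (1 - Pearson correlation) and average linkage are assumptions, since the text does not state which were used, and the expression values and function names are hypothetical toy examples.

```python
from itertools import combinations
from statistics import mean, pstdev


def zscore(values):
    """Standardize one dataset so that its mean is 0 and its SD is 1."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def pearson_distance(a, b):
    """1 - Pearson correlation between two equal-length profiles."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return 1.0 - cov / (var_a * var_b) ** 0.5


def average_linkage(profiles):
    """Agglomerative clustering; returns the list of merges in order."""
    clusters = {name: [name] for name in profiles}
    merges = []
    while len(clusters) > 1:
        def dist(c1, c2):
            # Average pairwise distance between the members of two clusters.
            pairs = [(a, b) for a in clusters[c1] for b in clusters[c2]]
            return sum(pearson_distance(profiles[a], profiles[b])
                       for a, b in pairs) / len(pairs)

        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((c1, c2))
        clusters[f"({c1}+{c2})"] = clusters.pop(c1) + clusters.pop(c2)
    return merges


# Hypothetical mean expression of five genes in four cancer types.
toy = {
    "EGC": zscore([9.1, 8.7, 3.2, 2.9, 6.0]),
    "ERneg_breast": zscore([8.8, 8.9, 3.5, 3.1, 5.7]),
    "Bladder": zscore([3.0, 3.3, 8.9, 9.2, 6.1]),
    "Ewing": zscore([2.8, 3.6, 9.0, 8.5, 5.9]),
}
merges = average_linkage(toy)
print(merges)
```

Standardizing each dataset separately places arrays from different platforms and experiments on a comparable scale before their profiles are compared.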

Results 

Genome-wide expression analysis. We selected differentially
expressed genes (i.e., up- and downregulated genes) from the
27 pairs of EGC tissue and adjacent normal tissue. A
P-value cut-off of 0.05 in t-tests and a fold-change cut-off of
1.5 or 1/1.5 for up- and downregulated genes, respectively, were
used for selection. We identified 556 upregulated genes and 417
downregulated genes. The differentially expressed genes were
then fed into function, pathway and cross-experiment analyses to
acquire a deeper understanding of the molecular basis of EGC.
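
The selection rule can be written as a small filter. The sketch below assumes that the P-values have already been obtained from the t-tests, that expression values are on the log2 scale (as produced by RMA), and that the cut-offs are inclusive for fold-change and strict for the P-value; the inclusiveness and the function names are assumptions made for illustration.

```python
def fold_change(log2_tumor_mean: float, log2_normal_mean: float) -> float:
    """Tumor/normal fold change from mean log2 expression values."""
    return 2.0 ** (log2_tumor_mean - log2_normal_mean)


def classify(fc: float, p_value: float,
             fc_cutoff: float = 1.5, p_cutoff: float = 0.05) -> str:
    """Label a gene 'up', 'down' or 'unchanged' under the two cut-offs."""
    if p_value >= p_cutoff:
        return "unchanged"
    if fc >= fc_cutoff:
        return "up"
    if fc <= 1.0 / fc_cutoff:
        return "down"
    return "unchanged"


# Hypothetical genes: (mean log2 in tumor, mean log2 in normal, P-value).
examples = {
    "GENE_A": (10.0, 8.0, 0.001),   # 4-fold higher in tumor
    "GENE_B": (7.0, 8.0, 0.010),    # 2-fold lower in tumor
    "GENE_C": (9.0, 8.8, 0.001),    # significant but small change
    "GENE_D": (10.0, 8.0, 0.200),   # large change but not significant
}
for gene, (tumor, normal, p) in examples.items():
    print(gene, classify(fold_change(tumor, normal), p))
```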

Functional enrichment analysis. The BiNGO plug-in on the 
Cytoscape platform (http://www.cytoscape.org/) was used to 
explore the molecular function and biological processes in GO. 
Table IV. Pathway enrichment analysis for up- and downregulated genes in gastric cancer tissues.

Input genes: upregulated

KEGG ID    Pathway                                         Count(a)     P-value
hsa04060   cytokine-cytokine receptor interaction          [illegible]  [illegible]
hsa04512   ECM-receptor interaction                        [illegible]  [illegible]
hsa04110   cell cycle                                      [illegible]  [illegible]
hsa04640   hematopoietic cell lineage                      [illegible]  2.40E-04
hsa04620   Toll-like receptor signaling pathway            14           [illegible]
hsa04062   chemokine signaling pathway                     [illegible]  [illegible]
hsa04610   complement and coagulation cascades             11           [illegible]
hsa04510   focal adhesion                                  [illegible]  [illegible]
hsa04115   p53 signaling pathway                           10           [illegible]
hsa04514   cell adhesion molecules (CAMs)                  14           [illegible]
hsa05222   small cell lung cancer                          10           [illegible]
hsa04670   leukocyte transendothelial migration            12           1.11E-02
hsa05020   prion diseases                                  [illegible]  [illegible]
hsa04621   NOD-like receptor signaling pathway             [illegible]  [illegible]
hsa05200   pathways in cancer                              15           [illegible]
hsa05332   graft vs. host disease                          [illegible]  2.34E-02
hsa05322   systemic lupus erythematosus                    10           2.40E-02
hsa05219   bladder cancer                                  [illegible]  [illegible]
hsa04114   oocyte meiosis                                  [illegible]  [illegible]
hsa04650   natural killer cell mediated cytotoxicity       11           [illegible]
hsa04142   lysosome                                        10           [illegible]
hsa04630   Jak-STAT signaling pathway                      [illegible]  [illegible]
hsa04914   progesterone-mediated oocyte maturation         [illegible]  [illegible]

Input genes: downregulated

KEGG ID    Pathway                                         Count(a)     P-value
hsa00980   metabolism of xenobiotics by cytochrome P450    22           [illegible]
hsa00982   drug metabolism                                 22           [illegible]
hsa00830   retinol metabolism                              [illegible]  [illegible]
hsa00140   steroid hormone biosynthesis                    13           [illegible]
hsa00500   starch and sucrose metabolism                   11           2.00E-07
hsa00040   pentose and glucuronate interconversions        [illegible]  [illegible]
hsa00590   arachidonic acid metabolism                     12           [illegible]
hsa00983   drug metabolism                                 10           [illegible]
hsa00053   ascorbate and aldarate metabolism               [illegible]  [illegible]
hsa00591   linoleic acid metabolism                        [illegible]  [illegible]
hsa00860   porphyrin and chlorophyll metabolism            [illegible]  [illegible]
hsa00010   glycolysis/gluconeogenesis                      10           [illegible]
hsa00150   androgen and estrogen metabolism                [illegible]  [illegible]
hsa03320   PPAR signaling pathway                          [illegible]  1.41E-02
hsa00051   fructose and mannose metabolism                 5            1.59E-02
hsa00330   arginine and proline metabolism                 [illegible]  [illegible]
hsa00071   fatty acid metabolism                           5            2.74E-02
hsa00030   pentose phosphate pathway                       4            3.42E-02
hsa00350   tyrosine metabolism                             5            3.72E-02
hsa00920   sulfur metabolism                               3            4.53E-02
hsa00340   histidine metabolism                            4            5.00E-02
hsa00480   glutathione metabolism                          5            5.54E-02

(a) Count represents the number of input genes assigned to the KEGG pathway.



The functions of the upregulated genes of EGC tissues
were significantly associated with C-X-C and other
chemokine-related signaling, interleukin-8 binding, growth factor
binding, collagen binding, and the extracellular matrix (ECM)
(Fig. 1A). Moreover, the biological processes involved in the
upregulated genes in the EGC tissues were strongly related to
cell proliferation, mitosis, apoptosis and cell-matrix adhesion
(Fig. 1B). Additionally, wound healing terms, cell migration
terms and cell motility terms were also listed in the upregulated
genes with statistical significance (Table III). Since cell
migration, cell motility and wound healing are typically observed
in late-stage, metastatic cancer, this may indicate that EGC
tissues could possess intrinsic aggressiveness, despite their
early detection. Conversely, the downregulated genes were
strongly linked to oxidoreductase activity (e.g., oxidoreductase
activity acting on the CH or CH2 groups, quinones) in GO
molecular function (Fig. 1C). Furthermore, the downregulated
genes were enriched in various terms related to metabolic
processes (e.g., flavone and flavonoid metabolic pathways) in
GO biological processes (Fig. 1D). The GO terms of the
downregulated genes clearly indicate dysregulation of metabolism
in EGC, which is one of the emerging cancer hallmarks (14).
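
Because many GO terms are tested at once, the raw enrichment P-values are corrected for multiple comparisons before the FDR cut-off of 0.05 is applied. The sketch below shows the Benjamini-Hochberg procedure, which is assumed here to be the correction behind the corrected P-values because it is, to our knowledge, the BiNGO default; the input P-values are hypothetical and do not reproduce Table III, since the total number of terms tested is not reported.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]          # hypothetical raw P-values
corrected = benjamini_hochberg(raw)
significant = [q <= 0.05 for q in corrected]
print(corrected, significant)
```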

Pathway enrichment analysis. The DAVID tool
(http://david.abcc.ncifcrf.gov/) was used to inspect the KEGG
biological pathways associated with the differentially expressed
genes in EGC.

The upregulated genes in EGC tissues were intrinsically 
associated with cytokine-cytokine receptor interactions, 
ECM-receptor interactions, the cell cycle, hematopoietic cell 
lineage and Toll-like receptor signaling pathways (Table IV). In 
addition, focal adhesion and cell adhesion molecule pathways 
were highlighted. Thus, similar to the functional enrichment 
analysis, upregulated pathways in these tumor tissues suggest 
a strong potential for cell motility and metastasis, despite 
early detection. In contrast, the downregulated genes in the 
EGC tissues were strongly associated with xenobiotics-, 
drug-, retinol-, starch- and sucrose-related metabolism, steroid 
hormone biosynthesis, as well as pentose and glucuronate 
interconversion pathways (Table IV). 

The expression of MMPs in EGC tissue. Our functional and 
pathway analyses demonstrated that the significantly upregu- 
lated genes in EGC tissues are associated with cell migration 
and metastasis, events typical of late-stage cancer. To verify 
our findings, we further analyzed the expression pattern of 
matrix metalloproteinases (MMPs), which are well-known cell
migration-related genes. MMPs are also known to play critical
roles in the regulation of cell invasion by ECM proteolysis, as
well as by processing cytokine precursors in the chemokine
network.

We analyzed the expression pattern of 7 MMPs (MMP1,
-3, -7, -9, -10, -12 and -13) within the upregulated gene data
in EGC tissues. As expected, and consistent with the microarray
data where MMPs were upregulated 1.56- to 8.68-fold
(Fig. 2A), RT-PCR indicated that MMP mRNA expression
was highly upregulated in the patients' EGC tissues (Fig. 2B).

Cross-experimental analysis. In order to investigate similar 
molecular signatures between EGC and other cancer types, 
we compared our data of differentially expressed genes with 
a public gene expression signature warehouse, L2L. This 
revealed that the upregulated genes in EGC most significantly
correlated with the gene expression signature of ERα-negative
breast cancer (Table V). As summarized in Table V, the
upregulated genes in EGC were also similar to the gene expression
Figure 2. The expression of MMPs in EGC tissues. (A) Expression levels of
MMP genes in microarray data. The vertical axis represents fold-change of
the cancer tissues over normal tissues. (B) mRNA expression of MMPs using
RT-PCR. Three pairs of non-cancerous and tumor tissues from the microarray
analysis were used. (N, adjacent non-cancerous gastric tissue; T, EGC tissue).



signature related to an undifferentiated cancer status
(cancer_undifferentiated_meta_up: 69 genes commonly upregulated
in undifferentiated cancer relative to well-differentiated
cancer, from a meta-analysis of the OncoMine gene expression
database), stemness (stemcell_embryonic_up: enriched in
mouse embryonic stem cells, compared to differentiated brain
and bone marrow cells) and survival (dox_resist_gastric_up:
upregulated in gastric cancer cell lines resistant to doxorubicin,
compared to parent chemosensitive lines). Together, these
signatures indicate that the EGC tissues reflect several
cancer-related phenotypes, namely enhanced survival, stem-like
properties and an undifferentiated morphology.

The same L2L analysis was applied to the downregulated 
genes in EGC (Table V). Interestingly, epigenetic-related 
cancer gene expression signature terms (5azac_hepg2_up 
and 5azac-tsa_hepg2_up in Table V) were highly ranked. 
This suggests that global alterations in DNA methylation and
histone modification occur in EGC, as they do in other cancers.

Hierarchical clustering of the EGC tissues and other cancers.
To validate the result of the L2L analysis showing a relationship
between EGC and ERα-negative breast cancer, we performed
a hierarchical clustering analysis. The expression datasets of
the differentially expressed genes in EGC (556 upregulated gene
symbols and 417 downregulated gene symbols), ERα-negative
breast cancer and 4 additional cancers (small cell lung cancer,
LNCaP prostate cancer cell lines, bladder cancer and Ewing
sarcoma) were used in an unsupervised hierarchical clustering
analysis. As in the L2L analysis, the results indicated that EGC
correlated more closely with ERα-negative breast cancer than
with the other 4 cancers (Fig. 3).

When we inspected the expression levels (z-scores) of the 7 
MMP genes (Fig. 4), the results indicated that the ERα-negative
Table VI. Comparisons with other GC sources in terms of
upregulated EGC-related genes.

Refs.  EGC classification                     n(a)  ∩(b)  Significance (P-value)(c)
(2)    [illegible]                            488   83    <2.2E-16
(7)    Well-differentiated (WD) and           170   62    <2.2E-16
       moderately differentiated (MD)
(8)    AJCC staging I and II (TNM staging)    118   15    1.601E-06

(a) The number of upregulated genes in the cancer according to the
references. (b) The number of common upregulated genes (intersection)
between the references and ours. (c) The significance of the intersections
between our EGC upregulated genes and the studies was calculated.
Fisher's exact test, based on the randomization model, was used to
obtain the P-values of the intersections from a hypergeometric
distribution. The smaller the P-value, the more significant the agreement
between the previous study and our EGC study. The total number of
gene symbols used in the Fisher's exact test is 19,211 (HUGO Gene
Nomenclature Committee).



cancer, above all other observed cancers, showed the most
similar expression patterns for the 7 MMPs. Overall, the
hierarchical clustering was consistent with the cross-experimental
analysis and strongly supported the molecular similarity
between EGC and ERα-negative breast cancer in terms of
carcinogenesis.

Discussion 

We analyzed the microarray data generated from pairs of tumor
tissue and adjacent non-cancerous tissue, obtained from
27 EGC patients. The gene expression data were subjected to
functional and pathway analyses, as well as gene expression
signature comparison (cross-experiment analysis). This led
to 2 novel findings: i) the functional and pathway analyses
suggested that metastasis-related biological processes may
already be highly expressed even in the early stage of gastric
cancer, and ii) the gene expression pattern of EGC is closely
aligned to that of ERα-negative breast cancer.

We also compared the differentially expressed genes in our
EGC tissues with 3 other previously published gene expression
studies (2,7,8). We found that the upregulated genes in our study
significantly overlapped with the upregulated genes in the EGC
groups of the 3 earlier studies (2,7,8), under a randomization
model (Table VI). Recently, Vecchi et al suggested a
carcinogenesis model (2) in which the transition from normal mucosa
to EGC is accompanied by cell cycle upregulation; our pathway
analysis results (hsa04110, cell cycle in Table IV) are consistent
with this model. Interestingly, AGC functions (cell migration-
and ECM-related functions) suggested by the Vecchi model
were also revealed in our EGC data, again indicating that
EGC actually harbors gene expression events that are usually
observed in the later stages of cancer, such as AGC.
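
The significance of an overlap between two gene lists under a randomization model can be illustrated with a one-sided hypergeometric tail probability. The sketch below uses the counts from Table VI; it is an illustration of the principle rather than the exact procedure used in the study, so the values it prints need not match the reported P-values exactly, and the function name is hypothetical.

```python
from math import comb


def overlap_p_value(total: int, ours: int, theirs: int, shared: int) -> float:
    """P(X >= shared) when `theirs` genes are drawn at random from `total`
    gene symbols, of which `ours` are marked (hypergeometric upper tail)."""
    upper = min(ours, theirs)
    favourable = sum(
        comb(ours, k) * comb(total - ours, theirs - k)
        for k in range(shared, upper + 1)
    )
    return favourable / comb(total, theirs)


TOTAL_SYMBOLS = 19211    # gene-symbol universe given in the Table VI footnote
OUR_UPREGULATED = 556    # upregulated genes identified in this study

for ref, n_ref, n_shared in [("(2)", 488, 83), ("(7)", 170, 62), ("(8)", 118, 15)]:
    p = overlap_p_value(TOTAL_SYMBOLS, OUR_UPREGULATED, n_ref, n_shared)
    print(f"ref {ref}: {n_shared} of {n_ref} genes shared, P = {p:.3g}")
```

Exact integer arithmetic is used for the binomial coefficients, so the very small tail probabilities are not lost to floating-point underflow before the final division.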

Based on our functional and pathway analyses, the 
upregulated genes in the EGC tissues were highly enriched for 
Figure 3. Hierarchical clustering. Genes up- or downregulated in EGC as well as 5 independent cancer types were used in the hierarchical clustering analysis.
Each cancer type is presented with the following column side-bars: EGC (brown), ERα-negative breast cancers (orange), bladder cancer (grey), Ewing sarcoma
(black), small cell lung cancers (yellow) and LNCaP prostate cancer cell lines (blue). Seven MMP genes are presented with row side-bars.



Figure 4. The expression (z-scores) of 7 MMP genes. A z-score of zero corresponds to the mean of all expression values in each cancer type. (EGC, the EGC tissues; ERα
(-), ERα-negative breast cancer; Lung, small cell lung cancer; LNCaP, LNCaP prostate cancer cell lines; Bladder, bladder cancer; and Ewing, Ewing sarcoma).



genes involved in cell proliferation, chemokine/growth factor 
signaling and cell migration. The computational implication 
is, in fact, closely related to MMP activity, as MMP substrates 
include growth factor/chemokine precursors and E-cadherin 



(15,16). We validated the upregulation of the 7 MMPs in the 
EGC tissues by RT-PCR. This result suggests that the activa- 
tion of multiple MMPs may be involved in the early stage of 
cancer. The suggestion is noteworthy, when considering that 
Figure 5. The upregulated genes belonging to the KEGG TLR signaling
pathway (A) and cell cycle pathway (B) in our EGC tissues.



the roles of multiple MMPs were mainly reported in late-stage
gastric cancer (2,17). It is also interesting to note that 6
(MMP1, -3, -7, -10, -12 and -13) of the 7 MMPs are clustered at
11q22, implying that epigenetic events could be involved in the
upregulation of the clustered MMPs (18).

Additionally, we found that the gene expression pattern
in EGC tissues resembles that of the ERα-negative
breast cancer transcriptome. Since ERα-negative breast cancer
clusters with EGC (Fig. 3), the similarity suggests that these
two cancers may share common molecular features. Recent
breast cancer studies (19,20) reported that high expression of
cyclooxygenase-2 (Cox-2), encoded by PTGS2, is associated
with poor survival in ERα-negative breast cancer patients,
when compared to ERα-positive breast cancers. Interestingly,
Cox-2 is highly involved in the inflammation-associated
carcinogenesis of the gastrointestinal tract. In particular,
H. pylori-infected gastric epithelial cells can undergo malignant
transformation via Toll-like receptor (TLR) signaling that
induces Cox-2, followed by activation of cell proliferation (21).
In fact, our pathway analysis in EGC showed upregulation of
the KEGG TLR signaling pathway and cell cycle pathway
(Table IV, Fig. 5). Our EGC tissues also showed markedly increased
expression of PTGS2 (5.74-fold change). Thus, the similarity
between EGC and ERα-negative breast cancer may arise
from shared subsets of immune response-related signaling
in the microenvironments of the two tumors.

In conclusion, we have analyzed the differentially expressed 
genes in EGC patients using an integrative systematic approach. 
We found that genes highly expressed in EGC are involved 
in cell migration- and metastasis-related functions typically 
observed in late-stage cancer. Also, EGC may be intrinsically
similar to ERα-negative breast cancer, sharing immune-related
signaling events that warrant further dissection in both
cancer types. The functional roles of the downregulated genes
in EGC carcinogenesis remain to be elucidated in the future.